Set-Theoretic Problems of Null Completion in Relational Databases
Arthur  M.  Keller 

Computer  Science  Dept.,  Stanford  University 


ABSTRACT. When considering using databases to represent incomplete information, the relationship between two facts where one may imply the other needs to be addressed. In relational databases, this question becomes whether null completion is assumed. That is, does a (possibly partially defined) tuple imply the existence of tuples that are “less informative” than the original tuple? We show that no relational algebra that assumes equivalence under null completion can include set-theoretic operators that are compatible with ordinary set theory.

KEYWORDS. Relational databases, null values, set theory.

CR Categories. H.2.1, H.1.1, E.4.

1.  Introduction 

Considerable work has been done on incomplete information [ANSI 75, Codd 79, Goldstein 81, Grant 77, 79, Imielinski 81, 83, Keller 84a, 84b, Lien 79, Lipski 79, Maier 83, Reiter 80, Vassiliou 79, Zaniolo 82]. No solution has been completely satisfactory. One problem with many of the proposed solutions is that they are incompatible with the rules of ordinary set theory. We show that no solution that includes the concept of null completion [Zaniolo 82] can possibly be compatible with ordinary set theory.

One question that arises when considering incomplete information is the relationship between facts and partial versions of those facts. For example, the fact “Marty has been married to Barbara for seven years” includes the facts “Marty is married to Barbara” and “Marty has been married for seven years.”

Consider the following relation encoding the three facts mentioned above, with the functional dependencies Husband → Wife, YearsMarried and Wife → Husband, YearsMarried.

This work was supported in part by contract N00039-82-G-0230 (the Knowledge Base Management Systems Project, Prof. Gio Wiederhold, Principal Investigator) from the Defense Advanced Research Projects Agency and by contract AFOSR-80-0212 (Universal Relations, Prof. Jeff Ullman, Principal Investigator) from the Air Force Office of Scientific Research, both of the United States Department of Defense. The views and conclusions contained in this document are those of the authors and should not be interpreted as representative of the official policies of DARPA or the US Government.

Author’s address: Computer Science Department, Stanford University, Stanford, CA 94305-2088.


Husband    Wife       YearsMarried
Marty      Barbara    7
Marty      <null>     7
<null>     Barbara    7

The information contained in the first tuple includes the information contained in the other two tuples. In a static database, all queries answerable from the entire database are also answerable from the first tuple alone. However, if we delete the first tuple (say, in an update asserting that Marty is not married to Barbara) but retain the other two, we can still answer some queries asking how long Marty or Barbara has been married (but not to each other).

The above example illustrates the principle of null completion [Zaniolo 82, Maier 83]. Tuple $t$ is at least as informative as tuple $s$ (written $t \geq s$) if the non-null attributes of $s$ have the same values as in $t$. A database $D_1$ is at least as informative as database $D_2$ if for every tuple of $D_2$ there is a corresponding tuple in $D_1$ that is at least as informative. The principle of null completion says that two databases are equivalent if they are equally informative, that is, if each is at least as informative as the other.
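The ordering on tuples and databases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tuples are modeled as Python tuples, the null is modeled as `None`, and the names `NULL`, `at_least_as_informative`, `db_at_least_as_informative`, and `equivalent` are hypothetical, not taken from the cited work.

```python
NULL = None  # assumed stand-in for the "no information" null


def at_least_as_informative(t, s):
    """t >= s: every non-null attribute of s has the same value in t."""
    return len(t) == len(s) and all(
        sv is NULL or sv == tv for tv, sv in zip(t, s)
    )


def db_at_least_as_informative(d1, d2):
    """D1 >= D2: every tuple of D2 is dominated by some tuple of D1."""
    return all(any(at_least_as_informative(t, s) for t in d1) for s in d2)


def equivalent(d1, d2):
    """Equivalence under null completion: equally informative databases."""
    return db_at_least_as_informative(d1, d2) and db_at_least_as_informative(d2, d1)


# The marriage example: the first tuple alone versus all three tuples.
full = {("Marty", "Barbara", 7)}
all_three = {("Marty", "Barbara", 7), ("Marty", NULL, 7), (NULL, "Barbara", 7)}
partial = all_three - full  # what remains after deleting the first tuple

print(equivalent(full, all_three))
print(db_at_least_as_informative(partial, full))
```

Under these assumptions the one-tuple database and the three-tuple database are equivalent, while the two partial tuples alone are strictly less informative than the full tuple.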

2.  Definitions 

We will use nulls to indicate that no information is known. Such nulls do not distinguish between the attribute being inapplicable to the tuple and its value being known to lie in a particular set. A tuple $t$ is at least as defined as tuple $s$ ($t \geq s$) if all non-null attributes in $s$ have matching non-null values in $t$. The null completion of relation $R$ is $\{ t \mid (\exists s \in R)(t \leq s) \}$. “No information” nulls and null completion are Zaniolo’s concepts [Zaniolo 82].

Null completion induces equivalence classes of relations for each relation schema. We designate the equivalence class containing the relation $R$ as $\hat{R}$. We distinguish two particular relations which are members of $\hat{R}$. The maximal representative of $\hat{R}$ is defined by

    $R^* = \{ t \mid (\exists s \in R)(t \leq s) \}$.

The minimal representative of $\hat{R}$ is defined by

    $R_* = \{ t \in R \mid (\nexists s \in R)(s \neq t \wedge s \geq t) \}$.
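The two representatives can be computed directly for small relations. The sketch below assumes tuples are Python tuples with `None` as the null; the function names `geq`, `maximal`, and `minimal` are hypothetical labels for $\geq$, $R^*$, and $R_*$.

```python
from itertools import product

NULL = None  # assumed stand-in for the null value


def geq(t, s):
    """t >= s: every non-null attribute of s has the same value in t."""
    return all(sv is NULL or sv == tv for tv, sv in zip(t, s))


def maximal(rel):
    """R^*: all tuples obtained by nulling any subset of attributes of a tuple of R."""
    out = set()
    for s in rel:
        for mask in product((False, True), repeat=len(s)):
            out.add(tuple(NULL if hide else v for v, hide in zip(s, mask)))
    return out


def minimal(rel):
    """R_*: tuples of R not dominated by a different tuple of R."""
    return {t for t in rel if not any(s != t and geq(s, t) for s in rel)}


R = {("a1", "b1")}
print(maximal(R))           # the four tuples of the null completion
print(minimal(maximal(R)))  # back to the single fully defined tuple
```

Two relations then belong to the same equivalence class exactly when their `maximal` results are equal, which is the test used in the later sketches.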

We now consider definitions of the set-theoretic operations. We can define operations on these equivalence classes that we will call union, intersection, and difference. These operations take two equivalence classes and result in an equivalence class. A fourth operation, membership, takes a tuple and an equivalence class and has a boolean result. We will use the symbols $\dot{\cup}$, $\dot{\cap}$, and $\dot{-}$ to represent Zaniolo's [Zaniolo 82] operators on these equivalence classes, while $\hat{\cup}$, $\hat{\cap}$, and $\hat{-}$ will represent arbitrary definitions of these operators under certain constraints (to be described later). Similarly, $\dot{\in}$ and $\hat{\in}$ are symbols for membership. (Choice of a particular membership operator constrains the choices of the other three operators.) Although these operators apply to equivalence classes (the extended relations), their intuitive meaning is based on how they operate on the members of the equivalence classes (the ordinary relations with nulls). Therefore, we will often write $R \hat{\cup} S$ when we mean $\hat{R} \hat{\cup} \hat{S}$, etc., but we remember that the results of both of them are extended relations. Since these operators are well defined on the equivalence classes, it does not matter which representative relation is used.

Using our terminology, we rephrase Zaniolo's definitions of union, difference, and x-intersection by supplying representative sets of the resultant equivalence classes. Union is defined by $(R \dot{\cup} S)^* = R^* \cup S^*$. Difference is $R \dot{-} S = R^* - S^*$. We define x-intersection by $(R \dot{\cap} S)^* = R^* \cap S^*$. The definition of membership is $r \dot{\in} R \leftrightarrow r \in R^*$.

Let us consider a few examples. We will use $\bot$ to indicate a null value. Suppose $R$ is $\{(a_1, b_1)\}$. Then $R^*$ consists of $(a_1, b_1)$, $(\bot, b_1)$, $(a_1, \bot)$, and $(\bot, \bot)$; $R_*$ consists only of $(a_1, b_1)$. Suppose $S$ is $\{(a_1, \bot)\}$. Then $S^*$ consists of $(a_1, \bot)$ and $(\bot, \bot)$; $S_*$ consists only of $(a_1, \bot)$. Note that $R \dot{\cup} S = R$ and $R \dot{\cap} S = S$. Also, $T = R \dot{-} S = \{ (a_1, b_1), (\bot, b_1) \}$. It is interesting to note that $\hat{T} = \hat{R}$.

With the above examples, we can illustrate that these definitions do not satisfy some basic theorems in set theory adapted to these extended relations. In particular, $R \dot{-} (R \dot{-} S) = R \dot{\cap} S$ [Halmos 60] is not satisfied: the left side is $\emptyset$ and the right side is $S$.

Another problem with Zaniolo’s approach is that the definitions do not reduce to the standard definitions when only fully defined relations are used. For example, suppose $R$ is $\{(a_1, b_1)\}$ and $S$ is $\{(a_1, b_2)\}$. Then $R \dot{\cap} S$ is $\{(a_1, \bot)\}$, while $R \cap S$ is $\emptyset$.

We can consider alternative definitions of the set-theoretic operators, attempting to find a set of definitions that satisfies the basic theorems of set theory. (For example, $R \hat{\cap} S = (R_* \cap S^*) \cup (R^* \cap S_*)$ reduces to the standard definition for fully defined relations, but still violates $R \hat{-} (R \hat{-} S) = R \hat{\cap} S$.)
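Both observations can be checked on the running examples. In the sketch below, `x_intersection` is the x-intersection on maximal representatives and `alt_intersection` is the alternative definition just mentioned. One assumption should be stated loudly: the text does not say which difference operator accompanies the alternative intersection, so the sketch pairs it with Zaniolo's difference. All function names are hypothetical and `None` stands for the null.

```python
from itertools import product

NULL = None  # assumed stand-in for the null value


def geq(t, s):
    return all(sv is NULL or sv == tv for tv, sv in zip(t, s))


def maximal(rel):
    out = set()
    for s in rel:
        for mask in product((False, True), repeat=len(s)):
            out.add(tuple(NULL if hide else v for v, hide in zip(s, mask)))
    return out


def minimal(rel):
    return {t for t in rel if not any(s != t and geq(s, t) for s in rel)}


def x_intersection(r, s):
    return maximal(r) & maximal(s)


def alt_intersection(r, s):
    """(R_* & S^*) | (R^* & S_*), the alternative considered in the text."""
    return (minimal(r) & maximal(s)) | (maximal(r) & minimal(s))


def z_difference(r, s):
    """Assumption: Zaniolo's difference is paired with alt_intersection."""
    return maximal(r) - maximal(s)


# Fully defined relations: the alternative agrees with ordinary intersection.
R1, S1 = {("a1", "b1")}, {("a1", "b2")}
print(minimal(x_intersection(R1, S1)), alt_intersection(R1, S1))

# With a null present, R - (R - S) and the alternative intersection still differ.
R2, S2 = {("a1", "b1")}, {("a1", NULL)}
lhs = z_difference(R2, z_difference(R2, S2))
rhs = alt_intersection(R2, S2)
print(lhs, rhs)
```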


3.  Theorem 

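Let us consider some requirements for set-theoretic operators. The following 5 axioms are adapted from ordinary set theory [Halmos 60] for extended relations.

    $((t \hat{\in} R) \leftrightarrow (t \hat{\in} S)) \leftrightarrow R = S$    (1)

    $((t \hat{\in} R) \to (t \hat{\in} S)) \leftrightarrow R \hat{\subseteq} S$    (2)

    $t \hat{\in} R \hat{\cup} S \leftrightarrow t \hat{\in} R \vee t \hat{\in} S$    (3)

    $t \hat{\in} R \hat{\cap} S \leftrightarrow t \hat{\in} R \wedge t \hat{\in} S$    (4)

    $t \hat{\in} R \hat{-} S \leftrightarrow t \hat{\in} R \wedge t \hat{\notin} S$    (5)

It is interesting to note that Zaniolo’s definitions satisfy all of these axioms except for (5).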
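A spot check on the earlier example shows where axiom (5) breaks for Zaniolo's operators. This is only a check on one pair of relations, not a proof of (3) and (4) in general. The sketch assumes Zaniolo's membership is membership in the maximal representative; the names `z_member`, `ax3`, `ax4`, and `ax5` are hypothetical, and `None` stands for the null.

```python
from itertools import product

NULL = None  # assumed stand-in for the null value


def maximal(rel):
    out = set()
    for s in rel:
        for mask in product((False, True), repeat=len(s)):
            out.add(tuple(NULL if hide else v for v, hide in zip(s, mask)))
    return out


def z_member(t, rel):
    """Zaniolo's membership: t is in the maximal representative of rel."""
    return t in maximal(rel)


R = {("a1", "b1")}
S = {("a1", NULL)}
candidates = maximal(R) | maximal(S)

union_rep = maximal(R) | maximal(S)
inter_rep = maximal(R) & maximal(S)
diff_rep = maximal(R) - maximal(S)

ax3 = all(z_member(t, union_rep) == (z_member(t, R) or z_member(t, S))
          for t in candidates)
ax4 = all(z_member(t, inter_rep) == (z_member(t, R) and z_member(t, S))
          for t in candidates)
ax5 = all(z_member(t, diff_rep) == (z_member(t, R) and not z_member(t, S))
          for t in candidates)

print(ax3, ax4, ax5)
```

The counterexample to (5) here is the tuple `("a1", NULL)`: it belongs to the null completion of the difference, yet it is also a member of `S`.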

We shall also require several soundness criteria. First, if a tuple is a member of every relation in an equivalence class, it must be in the extended relation, and if a tuple is a member of an extended relation, it must be in some relation in the corresponding equivalence class.

    $t \in R_* \to t \hat{\in} R \to t \in R^*$    (I)

Second, extended relations preserve set inclusion.

    $R \subseteq S \to R \hat{\subseteq} S$    (II)

Third, since extended relations are based on relative information content, only a tuple that is more informative can determine membership.

    $t \hat{\in} R \to t \hat{\in} \{ r \in R \mid r \geq t \}$    (III)

The fourth criterion requires compactness: if a tuple is a member of an extended relation, it can be traced to a single tuple in the original relation.

    $t \hat{\in} R \to (\exists r \in R)(t \hat{\in} \{ r \})$    (IV)

Note that the leftward implication of the last two criteria can be derived from (2) and (II). Zaniolo’s definitions satisfy all four of these last criteria.

We shall show that no definition for the four set-theoretic operators defined on extended relations can be compatible with (1)-(5) and (I)-(IV).

Theorem. No definitions of the membership, intersection, union, and difference operators defined on extended relations are compatible with (1)-(5) and (I)-(IV).


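Proof (by contradiction). Suppose that a set of such definitions exists. Let $\hat{\in}$, $\hat{\cap}$, $\hat{\cup}$, and $\hat{-}$ be membership, intersection, union, and difference operators, respectively, defined on extended relations that are compatible with (1)-(5) and (I)-(IV).

From (1)-(5), we can derive the following theorems we shall use [Halmos 60].

    $R \hat{\subseteq} S \wedge R \hat{\subseteq} T \to R \hat{\subseteq} (S \hat{\cap} T)$    (6)

    $(R \hat{-} S) \hat{\cap} (R \hat{\cap} S) = \emptyset$    (7)

    $R \hat{-} S = \emptyset \leftrightarrow R \hat{\subseteq} S$    (8)

    $(R \hat{\subseteq} S) \wedge (S \hat{\subseteq} R) \leftrightarrow (R = S)$    (9)

    $(t \hat{\in} R) \wedge (R \hat{\subseteq} S) \to (t \hat{\in} S)$    (10)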

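Lemma. $R_* \cap S_* \hat{\subseteq} R \hat{\cap} S$.

Proof of lemma. We first note that $R_* \cap S_* \subseteq R_*$. By (II), $R_* \cap S_* \hat{\subseteq} R$. Similarly, $R_* \cap S_* \hat{\subseteq} S$. By (6), $R_* \cap S_* \hat{\subseteq} R \hat{\cap} S$.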

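Intersection. We will show that $R \hat{\cap} S = R_* \cap S_*$. Suppose that $R \hat{\cap} S$ is not equivalent to $R_* \cap S_*$ for some $R$ and $S$. Then by the lemma, there exists a tuple $t \hat{\in} R \hat{\cap} S$ and $t \hat{\notin} R_* \cap S_*$. Since $t \hat{\notin} R_* \cap S_*$, either $t \notin R_*$ or $t \notin S_*$. (If $t \in R_*$ and $t \in S_*$, then $t \in R_* \cap S_*$, and by (I), $t \hat{\in} R_* \cap S_*$.) We define $R' = \{ r \in R_* \mid r \geq t \wedge t \hat{\in} \{ r \} \}$ and also $S' = \{ s \in S_* \mid s \geq t \wedge t \hat{\in} \{ s \} \}$. By (III) and (IV), $t \hat{\in} R'$ and $t \hat{\in} S'$. Then from (4) we obtain that $t \hat{\in} R' \hat{\cap} S'$. Suppose $R' \hat{-} S' = \emptyset$. Then by (8), $R' \hat{\subseteq} S'$. Similarly, $S' \hat{-} R' = \emptyset$ implies $S' \hat{\subseteq} R'$.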

Case I. Both differences are empty. Then by (9), $\hat{R}' = \hat{S}'$. Since the minimal representative of an equivalence class is unique, $R'_* = S'_*$. But since $R'_* = R'$ and $S'_* = S'$ (a subset of a minimal representative is still a minimal representative, although of a different equivalence class), $R' = S'$. Then since $R' \subseteq R_*$ and $S' \subseteq S_*$, $R' \subseteq R_* \cap S_*$. Then, by (II) and (10), $t \hat{\in} R_* \cap S_*$. This is a contradiction.

Case II. At least one of the differences is non-empty. Without loss of generality, assume that $R' \hat{-} S'$ is non-empty. Then let $r \hat{\in} R' \hat{-} S'$. Then $r \hat{\in} R'$ (by (5)) and $\{ r \} \hat{\subseteq} R' \hat{-} S'$. (By (2) and (I), $t \hat{\in} T \to \{ t \} \hat{\subseteq} T$.) We defined $R'$ above so that $t \hat{\in} \{ r \}$. Therefore, $t \hat{\in} R' \hat{-} S'$. By (4) and (10), $t \hat{\in} (R' \hat{-} S') \hat{\cap} (R' \hat{\cap} S')$. This contradicts (7).

Since both cases result in a contradiction, we have shown

    $R \hat{\cap} S = R_* \cap S_*$    (*)

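Membership. We will show that $t \hat{\in} \hat{T} \leftrightarrow t \in T_*$. (The leftward implication is the first half of (I).) Suppose that $t \hat{\in} \hat{T}$ but $t \notin T_*$. Then define $R = \{ r \in T_* \mid r \geq t \wedge t \hat{\in} \{ r \} \}$ and let $S = \{ t \}$. By (III) and (IV), $t \hat{\in} R$. By (I), $t \hat{\in} S$. Therefore, by (4), $t \hat{\in} R \hat{\cap} S$. Since $t \notin T_*$, $t \notin R$. (Note that $R_* = R$ and $S_* = S$.) Therefore $R_* \cap S_* = \emptyset$. Consequently, $t \hat{\notin} R_* \cap S_*$. But this contradicts (*). Therefore we have shown

    $t \hat{\in} \hat{T} \leftrightarrow t \in T_*$    (**)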

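Union. We will show that $R \hat{\cup} S = R_* \cup S_*$. By (3) and (**), $t \hat{\in} R \hat{\cup} S \leftrightarrow t \hat{\in} R \vee t \hat{\in} S \leftrightarrow t \in R_* \vee t \in S_* \leftrightarrow t \in R_* \cup S_* \leftrightarrow t \hat{\in} R_* \cup S_*$. Thus, we have shown

    $R \hat{\cup} S = R_* \cup S_*$    (***)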

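Contradiction. Let $R = \{ (a, \bot) \}$ and $S = \{ (a, b) \}$. We observe that $R_* = R$ and $S_* = S$. Now consider $R \hat{\cup} S$. Using (***), $T = R \hat{\cup} S = \{ (a, b), (a, \bot) \}$. But $(a, \bot)$ is less informative than $(a, b)$. Therefore, $T_* = \{ (a, b) \}$. We note that $(a, \bot) \hat{\in} R$, yet $(a, \bot) \hat{\notin} T$ by (**), although (3) requires every member of $R$ to be a member of $R \hat{\cup} S$. This is a contradiction. ∎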
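The final step of the proof can be replayed concretely. The sketch below takes the derived characterizations at face value: membership is membership in the minimal representative, as in (**), and union is represented by the union of the minimal representatives, as in (***). It then shows that axiom (3) fails on the two relations used above. The names `geq`, `minimal`, `member`, and `union` are hypothetical, and `None` stands for the null.

```python
NULL = None  # assumed stand-in for the null value


def geq(t, s):
    """t >= s: every non-null attribute of s has the same value in t."""
    return all(sv is NULL or sv == tv for tv, sv in zip(t, s))


def minimal(rel):
    """R_*: tuples of rel not dominated by a different tuple of rel."""
    return {t for t in rel if not any(s != t and geq(s, t) for s in rel)}


def member(t, rel):
    """Membership as forced by (**): t is in the minimal representative."""
    return t in minimal(rel)


def union(r, s):
    """A representative of the union as forced by (***)."""
    return minimal(r) | minimal(s)


R = {("a", NULL)}
S = {("a", "b")}
T = union(R, S)
t = ("a", NULL)

print(T, minimal(T))
# Axiom (3) would need these two values to agree; they do not.
print(member(t, R) or member(t, S), member(t, T))
```

The tuple `("a", NULL)` is a member of `R` but is absorbed by `("a", "b")` once the two relations are combined, which is exactly the conflict between null completion and axiom (3).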

We have shown that no definitions of the four set-theoretic operators compatible with extended relations can be compatible with traditional set theory.

4.  Conclusion 

Relational database theory relies heavily on ordinary set theory. Intuitively, null completion appears to be important for dealing with nulls. The previously proposed approaches that incorporated null completion were not compatible with set theory, but it was not known whether a compatible approach existed. We have shown that, if we adopt null completion, our set-theoretic operators cannot behave according to the intuitive rules of ordinary set theory. We are faced with the Hobson’s choice between giving up our intuitive definitions of set-theoretic operators and giving up null completion. We suggest that future work attempt to compensate for the loss of null completion in order to save the familiar definitions of set-theoretic operators.